E-learning platforms, where students can share their opinions and express
their thoughts, have expanded rapidly and become a rich source of data for
opinion mining and sentiment analysis. This study aims to develop an
effective model for predicting students' attitudes about e-learning, with a
focus on mining opinions that indicate positive or negative sentiments. The
study was implemented in two stages. The first stage aimed to identify the
most popular e-learning platform at the University of Mosul, in order to
collect the largest possible amount of data from comments posted within the
platforms and to identify trends in students' opinions towards e-learning.
The results show that the focus of both lecturers and students revolved
around well-known platforms such as Google Classroom and Google Meet,
both of which had high relative importance (45.33% and 42.29%, respectively).
The second stage applies a machine-learning algorithm to the collected data to
determine the impact of e-learning on students. In addition, two feature
selection approaches, information gain (IG) and CHI statistics, were explored
and enhanced, together with a hybrid learning strategy based on the hidden
Markov model (HMM) and the support vector machine (SVM). As a result, an
opinion mining method was used to assist developers in improving and promoting
the quality of relevant services.
1. INTRODUCTION

Tan et al. [1] define a learning management system (LMS) as a platform that provides lecturers and
students with educational materials online. An LMS helps instructors follow their students' progress and
minimizes the need for in-person teaching. It assists the educational process by constituting a central location
for accessing material online. E-learning supports the educational process and enhances creativity. It provides
guidance, counseling, examination organization, and management and evaluation of resources and
strategies [2].

The e-learning review aims to support Iraqi faculty in successfully integrating e-learning broadly 
across Iraqi campuses. Faculty will understand e-learning concepts and methods, e-content, effective online 
pedagogy, and course design. Faculty will understand e-learning within the Iraqi higher education context 
and explore technologies relevant to online learning. Balogun and Ahlan [3] clarified that the need for
e-learning has grown due to the change in the expectations of training procedures and
outcomes. For example, many institutions have realized that education is no longer the memorization of
knowledge but rather the capability to solve problems with novelty and communication
skills. Ameen et al. [4] identified the benefits that e-learning, from the viewpoint of students in Iraqi
universities, will bring to higher education in Iraq. Furthermore, they focus on the challenges students can face
when using e-learning systems in Iraqi universities.

Abdulrazak and Ali [5] attempted to target the problems that have been experienced with e-learning
in Iraq and offered some suggestions that might contribute to reducing the delay in its adoption and support the
reality of education. Hundreds of professors, students, and teachers were involved in this research, and they are
fully aware of the basic implementation of electronic learning and the reasons for its delays. Several empirical
studies have focused on the challenges, barriers, and opportunities of e-learning implementation in Iraq.
Al-Azawei et al. [6] focus on the level of staff members' and students' adoption, application, familiarity, and
technology acceptance, confirming that social media is less complicated to use than e-learning platforms like
Moodle. At the same time, Qureshi et al. [7] found that electricity outages and English proficiency were
the main barriers to efficient e-learning integration.

Karkar et al. [8] compared trained and untrained teachers on their use of a learning management
system. They found that teachers who attended LMS training sessions had a higher degree of LMS operation.
Trained teachers appeared to make relatively more extensive use of 'grade core' and 'evaluation tool' but
used 'content' comparatively less in their teaching than teachers who did not attend any training
workshop. There is an opportunity for social interaction in an online classroom, but the problem is that the
communication is done through written text. Therefore, some factors are missing: body language,
gestures, and instant feedback from the student "listener". The authors of [9] discuss the difficulties faced by
online instructors in providing clear and visible guidance in an online environment. The asynchronous classroom
looks a lot different from a traditional lecture because there is a focus on active learning, for instance, using a
mix of video, assignments, discussion, chat, and peer review. It is possible to divide lessons into smaller parts to
manage time and communication with students [10].

Paula [11] suggests that the role and responsibilities of the instructor are critical factors for a
successful lesson. Knowledge and skills are needed for effective teaching, and the instructor needs to engage
students in online teaching and learning [11]. Due to the rapid expansion of e-learning platforms and the
need to extract the attitudes, opinions, and expectations of both lecturers and students, it has
become vital to create an accurate and valid tool to predict such feedback. This study aims to develop an
effective model for predicting students' attitudes about e-learning, with a focus on mining opinions that
indicate positive or negative sentiments. Opinion mining, also referred to as sentiment analysis or sentiment
classification, is a computational technique for understanding and interpreting opinion and sentiment by
processing vast volumes of data in a way that allows humans to make decisions [12].

In the context of e-learning, sentiment analysis refers to the use of an automatic text analysis process 
to extract opinions and identify a wide range of comments made in e-learning forums and websites where 
learners discuss or describe their thoughts, opinions, and evaluations of the services provided. Early detection 
of customer complaints and service issues does, in fact, aid in reducing the risk of broadly spreading 
defective items and improving advertising methods [13]. 

Sentiment analysis is a classification problem. The aim is to estimate the sentiment polarity and then 
categorize it into positive and negative feelings to discover attitudes and viewpoints expressed in any form or 
language [14]. Sentiment analysis of e-learning platforms provides platform developers with a good 
perspective and a quick and effective tool to track public perceptions of these platforms, among other things. 
Corporations, educational institutions, and individuals alike have shown a strong interest in e-learning.
E-learning systems are becoming increasingly prominent as an educational trend. E-learning is a term used to
describe teaching methods involving computers to convey knowledge in a nontraditional classroom setting. In
e-learning, it is vital to be aware of users' perspectives and design an evaluation based on them [15].

Message boards or blogs are essential for e-learning system makers to employ as a mechanism for
users to communicate, discuss, or share their ideas and comments on the services. Conducting systematic
surveys of users' views and opinions is one of the essential roles of e-learning management to meet their
wants and requirements as effectively as possible. This method allows people in charge of e-learning to
immediately detect and address any potential issues during the program's execution. This research aims to
train a sentiment classification algorithm that will categorize student opinions on e-learning system
services into positive and negative categories. As previously stated, the online reviews on e-learning blogs
and forums are currently too numerous to be read and assessed manually. As a result, there is a pressing need for
innovative systems that can automatically analyze and interpret the attitudes expressed in reviews by users
(learners or instructors). Improved sentiment classification algorithms that can automatically
determine whether the overall reviews of a specific e-learning system are positive or negative, based on the
examined e-learning blogs, would therefore be highly beneficial to developers. Due to the domain specificity of
e-learning review mining, the performance of sentiment classification algorithms on e-learning system reviews
should be examined. The goal is to alter and improve teaching methods and procedures by developing a conceptual
framework that can extract, analyze, and forecast user sentiment (i.e., that of learners and tutors). As a result,
the research's main contribution can be applied to various situations.

Such platforms have quickly become a gold mine for firms trying to monitor their reputation and
brands by gathering and analyzing public opinion about them, their markets, and competitors, with millions
of users and millions of messages posted every day. This study uses sentiment analysis to determine students'
and teachers' attitudes regarding distance learning. Evaluating the bag-of-words features of a text using
supervised machine learning techniques [16] is a systematic way to classify emotions. With this method, words
can be filtered from the bag-of-words vector, and the occurrence of words in each text is
represented in such vectors as a textual feature. The phrase syntax tree [16] is another way to design an emotion
classifier. The sentence is parsed to create a syntax tree representing the word relationships. The relationship
between word polarity, part-of-speech (POS) properties, and syntax can then be used to build a model or pattern
for an emotion classifier. Bag-of-words feature vectors and syntax trees are thus two distinct techniques for
representing text. Dave et al. [17] also used machine learning approaches to investigate sentiment classification.
Instead of employing all the words, they chose the top-ranked ones based on their generated scores.

Mullen and Collier [18] employ the support vector machine (SVM) algorithm to examine word
sentiment polarity, as well as subject and artist-oriented data. Anjaria and Guddeti [19] present a new
way to predict election outcomes by leveraging influential user factors. Their study also provides a hybrid method
of extracting opinions from Twitter data using direct and indirect characteristics, based on SVM, naive Bayes,
maximum entropy, and a supervised classifier based on artificial neural networks. In addition, principal
component analysis (PCA) has been integrated with SVM to reduce dimensionality. da Silva et al. [20] used
classifier ensembles and lexicons to provide a strategy for automatically classifying the sentiment
of tweets. With respect to a query term, tweets are categorized as positive or negative. In ensemble approaches,
multiple learners are trained to tackle the same problem, as shown in Figure 1.


Figure 1. Example of the majority voting combination rule: the majority of classifiers agree that the
class is positive


Jain and Katkar [21] propose a mechanism for predicting Indian people's overall attitudes and
predisposition toward political situations and issues. The suggested system flow is depicted in Figure 2.
Twitter API v1.1 is used to obtain raw training tweets. Following the collection of raw tweets, various
preprocessing methods are used to sanitize the data. The same methods are used to gather and sanitize raw
tweets to prepare the testing dataset. After the training and testing datasets have been prepared, various
classifiers are applied and their performance is examined. Saleh et al. [22] determine how the speaker or
writer feels about many facets of a situation. Figure 3 shows the opinion mining process they built, in
which the parts have the following responsibilities: data collection, opinion identification, aspect extraction,
opinion classification, summary production, and evaluation. This paper proposes applying a data mining method,
which has recently been researched in the business and computer realms, to the
e-learning process, thereby introducing a new area of study. We must apply appropriate sentiment classification
algorithms to mine the reviews collected from e-learning blogs and forums in order to study the data generated by
this method.

The motivation behind this work is that the e-learning platform has evolved into a comprehensive and
diverse information resource. This is due to the nature of these platforms, which enable users to post real-time
remarks about their ideas on various issues, discuss current events, complain, and express positive feelings about
goods they use daily. Companies that create such products have started polling these messages to see how
people feel about them. These firms often analyze and respond to customers' reactions on these platforms.

Figure 2. Mechanism for predicting Indian people's overall attitudes and predisposition
toward political situations

Figure 3. Opinion mining process


2. METHOD 

This study has two stages. The first is to identify the most popular e-learning platform among students
and tutors in order to extract many reviews. The second stage (explained in Results and discussion) is to analyze
the reviews collected from these platforms and other resources to predict students' and tutors' attitudes toward
e-learning, which can be extremely useful to businesses looking to improve their products. The first stage
included the following sections to identify the most popular e-learning platform.


2.1. Materials and methods 

The first stage of the research determines how difficult educational platforms and e-learning
technologies are to use. For example, it tries to identify the most popular video platforms and their
characteristics from both the professor's and the student's perspectives. It also investigates the relationship
between using these tools and the development of recommendations to improve the e-learning environment
in Iraqi universities, namely the University of Mosul. It is worth noting that the majority of those who
answered the questionnaire are professors and students at the University of Mosul, as the survey
encompassed all 22 of the university's colleges and centers with various specializations.


2.2. The questionnaire 

Due to COVID-19 restrictions, the study was concentrated mainly at the University of
Mosul and other Iraqi universities. The current research examines the difficulties in implementing e-learning
tools and approaches from both the professors' and the students' perspectives. A survey was created to gather
thoughts and experiences from professors and students on LMS. Nearly 500 teachers and students took part
in a discussion of their recent LMS experience, including which platforms, applications, and tools they prefer
to use.


2.3. Analytic hierarchy process (AHP)

The data was analyzed using the analytic hierarchy process (AHP) approach. AHP was established
by Saaty in 1980. According to Ahmed and Al-Moula [23], AHP uses pair-wise comparison to extract
relative weights for the factors in question. This method consists of three steps: first, organize the parameters
and assign importance degrees to them; second, build a matrix of relative weights; third, compute the
consistency ratio (CR) to check the consistency of the assigned importance degrees. If CR is less than 0.1, the
values do not need to be reweighted. The scale of relative importance ranges from 1 to 9, with 1 denoting equal
importance of the two factors compared and 9 denoting the extreme importance of one factor over the other.
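
As a minimal sketch of this procedure, the following Python code derives relative weights from the pairwise comparison matrix reported in Table 1. The weighting shown here, which normalizes each row total by the sum of all row totals, is inferred from the "Row total" and "Relative importance" columns of that table. The consistency ratio is estimated with a power iteration for the principal eigenvalue and Saaty's random index; the paper does not state how its CR values were computed, so that part is an assumption and its result may differ from the reported value. The function names are illustrative.

```python
# Pairwise comparison matrix taken from Table 1 (same row and column order).
TOOLS = ["Meet", "Zoom", "FCC", "Webex", "WhatsApp", "Facebook Messenger"]
MATRIX = [
    [1,   7,   8,   9, 9, 9],
    [1/7, 1,   2,   8, 6, 4],
    [1/8, 1/2, 1,   7, 3, 2],
    [1/9, 1/8, 1/7, 1, 1, 0.5],
    [1/9, 1/6, 1/3, 1, 1, 0.5],
    [1/9, 1/4, 1/2, 2, 2, 1],
]

# Saaty's random consistency index for matrices of size 1 to 9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def row_sum_weights(matrix):
    """Relative importance = row total / sum of all row totals."""
    totals = [sum(row) for row in matrix]
    grand_total = sum(totals)
    return [total / grand_total for total in totals]


def consistency_ratio(matrix, iterations=200):
    """Estimate CR = CI / RI, with CI = (lambda_max - n) / (n - 1).

    Assumption: lambda_max is approximated by power iteration;
    the matrix must be of size 3 to 9 so that RI is non-zero.
    """
    n = len(matrix)
    vector = [1.0 / n] * n
    lambda_max = float(n)
    for _ in range(iterations):
        product = [sum(a * x for a, x in zip(row, vector)) for row in matrix]
        # The vector sums to 1, so the sum of A*v converges to lambda_max.
        lambda_max = sum(product)
        vector = [value / lambda_max for value in product]
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


if __name__ == "__main__":
    for tool, weight in zip(TOOLS, row_sum_weights(MATRIX)):
        print(f"{tool}: {weight:.2%}")
    print("Estimated CR:", consistency_ratio(MATRIX))
```

With this row-sum normalization, the first two weights agree with the 47.98% and 23.59% listed for Meet and Zoom in Table 1.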


3. RESULTS AND DISCUSSION 
3.1. Data tabling 

The results show that the focus of both lecturers and students revolved around well-known
platforms, such as Google Classroom with a relative importance of 45.33% and Google Meet with a relative
importance of 42.29%, as shown in Tables 1, 2, and 3. Most lecturers preferred not to use other explanation board
applications due to a lack of knowledge of such applications and their benefits. The results also indicate that,
for the curriculum to be delivered effectively, most academic staff must maintain their knowledge and teaching
abilities and engage professionally with key e-learning applications that can benefit both the instructor and the
student.


3.2. Data analysis 

The individuals or things that make up the subject of the research problem are referred to as the
research community. Members of the teaching staff and students (undergraduate and graduate) at the
University of Mosul and other Iraqi universities are the subjects of this study. A case study was carried out
in which a questionnaire survey was designed to examine the essential platforms used in digital education and
other educational aspects of the e-learning process. The questionnaire was provided to a group of specialist
experts to obtain their feedback on its clarity, coherence, and applicability for assessing the
research variables. Most of the feedback was good. The survey was sent to a variety of colleges and
universities. The questionnaire had numerous components that covered the faculty members and students who
took part in the survey. It is critical to consider the roles, responsibilities, knowledge, and abilities required
for successful online education. A questionnaire is a tool for gathering introductory instructor comments.
Figure 4 shows the most used platforms in Iraqi universities, especially within the University of Mosul. The
results of using the AHP approach to evaluate the data suggest that both professors and students focused on
well-known platforms such as Google Classroom.


Table 1. Tools used by lecturers and students

No.  Tool                Meet  Zoom  FCC  Webex  WhatsApp  Facebook Messenger  Row total  Relative importance
1    Meet                1     7     8    9      9         9                   43.00      47.98%
2    Zoom                1/7   1     2    8      6         4                   21.14      23.59%
3    FCC                 1/8   1/2   1    7      3         2                   13.63      15.20%
4    Webex               1/9   1/8   1/7  1      1         0.5                 2.88       3.21%
5    WhatsApp            1/9   1/6   1/3  1      1         0.5                 3.11       3.47%
6    Facebook Messenger  1/9   1/4   1/2  2      2         1                   5.86       6.54%

Consistency ratio = 0.0692

Table 2. Explanation board applications used

No.  Application        None  Jamboard  PowerPoint  Drawchat  Whiteboard  Sketch board  Educreations  Cisco Webex Board  Powtoon  Row total  Relative importance
1    None               1     9         9           9         9           9             9             9                  9        73.00      39.90%
2    Jamboard           1/9   1         3           3         6           6             7             8                  9        43.11      23.57%
3    PowerPoint         1/9   1/3       1           1         2           3             3             4                  4        18.44      10.08%
4    Drawchat           1/9   1/3       1           1         2           2             3             3                  3        15.44      8.44%
5    Whiteboard         1/9   1/6       1/2         1/2       1           1             2             2                  2        9.28       5.07%
6    Sketch board       1/9   1/6       1/3         1/2       1           1             2             2                  2        9.11       4.98%
7    Educreations       1/9   1/7       1/3         1/3       1/2         1/2           1             1                  1        4.92       2.69%
8    Cisco Webex Board  1/9   1/8       1/4         1/3       1/2         1/2           1             1                  1        4.82       2.63%
9    Powtoon            1/9   1/9       1/4         1/3       1/2         1/2           1             1                  1        4.81       2.63%

Consistency ratio = 0.0480

Table 3. Platforms used by lecturers and students

No.  Platform          Google Classroom  Edmodo  Moodle  Canvas  Easyclass  Sakai  Row total  Relative importance
1    Google Classroom  1                 6       7       9       9          9      41.00      42.29%
2    Edmodo            1/6               1       4       8       8          9      30.17      31.11%
3    Moodle            1/7               1/4     1       4       4          5      14.39      14.85%
4    Canvas            1/9               1/8     1/4     1       1          2      4.49       4.63%
5    Easyclass         1/9               1/8     1/4     1       1          2      4.49       4.63%
6    Sakai             1/9               1/9     1/5     1/2     1/2        1      2.42       2.50%

Consistency ratio = 0.0888

Figure 4. The most used platforms in Iraqi universities, especially within the University of
Mosul (bar chart; x-axis: e-learning platforms, y-axis: percentage from 0% to 80%)


3.3. Sentiment analysis (Stage 2) 

A trained statistical classifier is used for sentiment classification in supervised machine learning.
The machine learning techniques are implemented using the traditional bag-of-features architecture. This
research proposes a method for predicting the overall attitudes of students and tutors, especially at the
University of Mosul, on e-learning issues and scenarios. The proposed system flow is depicted in Figure 5.

The raw training text messages are gathered with the use of API tools. After gathering raw data,
various preprocessing methods are used to clean up the data. Processing large amounts of data manually may
not be feasible. However, using AI techniques such as sentiment analysis, we can detect the
emotional tone of a text message in real time, at scale, and with high accuracy.


3.3.1. Data preprocessing 
Sometimes, text messages are not in a usable format. Various preprocessing methods for cleaning
documents are used to bring a document into a readable format. Stop words and special characters are
removed, spelling is corrected with a dictionary, abbreviations and slang are replaced with their expansions, and
lemmatization is performed. The comments are stripped of any special characters and hyperlinks. In addition,
duplicate messages are deleted from the training dataset.


3.3.2. Preprocessing of text 

The goal of the sentiment analysis is to classify documents as expressing either positive or negative attitudes
by analyzing the data that has been posted. Text preprocessing and text categorization are performed to
accomplish this. Several preprocessing steps are carried out, as follows:

- Stop word removal: Words that are not significant for emotion classification are referred to as
noise words. We compiled a list of such terms, which mostly comprises English prepositions, pronouns,
adverbs, particles, special characters, and numbers.

- Tokenization: Each text post is divided into tokens, which are significant words.

- Data standardization: All words in a document are converted to lower case to ensure data uniformity.

- Stemming: The Porter stemmer is applied to the data to return each word to its stem by removing suffixes
such as -ed, -ing, and -ion, which reduces the document's complexity and processing time, thereby improving the
model's performance.
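
The preprocessing steps described in Sections 3.3.1 and 3.3.2 can be sketched in Python as follows. This is only an illustration under stated assumptions: the stop-word list is a small hypothetical sample rather than the list compiled in the study, and the suffix stripper is a crude stand-in for the Porter stemmer, not an implementation of it. Spelling correction, abbreviation expansion, and lemmatization are omitted.

```python
import re

# Hypothetical sample stop-word list; the study's own list is not given.
STOP_WORDS = {"the", "a", "an", "is", "are", "was", "were", "to", "of",
              "and", "in", "on", "it", "i", "we", "they", "very"}

# Crude suffix list standing in for the Porter stemmer (an assumption).
SUFFIXES = ("ing", "ion", "ed", "s")

URL_RE = re.compile(r"https?://\S+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def crude_stem(word):
    """Strip one known suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Clean one message and return its list of stemmed tokens."""
    text = URL_RE.sub(" ", text)          # strip hyperlinks
    text = text.lower()                   # data standardization
    text = NON_LETTER_RE.sub(" ", text)   # drop special characters and numbers
    tokens = text.split()                 # tokenization on whitespace
    tokens = [t for t in tokens if t not in STOP_WORDS]  # stop word removal
    return [crude_stem(t) for t in tokens]               # stemming


def deduplicate(messages):
    """Remove duplicate messages, ignoring case and surrounding spaces."""
    seen = set()
    unique = []
    for message in messages:
        key = message.strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(message)
    return unique


if __name__ == "__main__":
    print(preprocess("The lectures on Google Meet were amazing! http://example.com/course"))
```

In practice, a full Porter stemmer from an established NLP library would replace `crude_stem`; the stand-in is used here only to keep the sketch free of external dependencies.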


Figure 5. The proposed methodology (flow: training dataset, a corpus of e-learning reviews; preprocessing by
tokenization, data standardization, stop word removal, abbreviation processing, and stemming; training;
prediction as positive or negative)


3.3.3. Feature selection 

With the help of feature selection techniques, the number of input variables can be reduced to those
that are thought to help a model predict the target variable most effectively. A further advantage of
feature selection is model interpretation: with fewer features, the output model is simpler and easier to
understand. Feature selection methods are used to pick out discriminating terms for training and
classification.
a. TFIDF 

Sentiment analysis is treated as a classification problem, typically a binary problem with positive
and negative target values. To make sentiment predictions on each text, the machine learning model must first
learn the sentiment score of each unique word in the document and how many times each word appears.
We must specify features and target values to train the model when dealing with a
supervised machine learning problem. The model's features are the text data after it has been transformed into
vectors. TFIDF is one of several vectorizers, each of which constructs the features differently.

The term frequency and inverse document frequency (TFIDF) vectorizer was used to produce the input
features. The following formula can be used to compute the TFIDF:


$$X_{i,w_j} = \frac{t_{i,w_j}}{\sum_{w_j} t_{i,w_j}} \cdot \log\left(\frac{1 + |D|}{1 + |\{i : t_{i,w_j} > 0\}|}\right) \qquad (1)$$

where $t_{i,w_j}$ is the number of times the term $w_j$ appears in document $i$ and $|D|$ is the number of
documents. As illustrated in the equation, the first term calculates the term frequency, whereas the second term
calculates the inverse document frequency. The first term evaluates the number of times the word $w_j$ appears
in document $i$, normalized by the document's length.
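
A minimal Python sketch of this weighting is given below. It assumes that the term frequency is the raw count divided by the document length and that the inverse document frequency uses add-one smoothing in both numerator and denominator; the smoothing and the function name `tfidf` are assumptions made for illustration, and the study's actual vectorizer may differ.

```python
import math
from collections import Counter


def tfidf(documents):
    """Compute TF-IDF weights for tokenized documents.

    documents: list of token lists.
    Returns one {term: weight} dictionary per document.
    """
    n_docs = len(documents)

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in documents:
        counts = Counter(tokens)
        length = len(tokens)
        vectors.append({
            # Term frequency normalized by length, times smoothed IDF.
            term: (count / length)
            * math.log((1 + n_docs) / (1 + doc_freq[term]))
            for term, count in counts.items()
        })
    return vectors


if __name__ == "__main__":
    docs = [["good", "course"], ["bad", "course"], ["good", "good", "teacher"]]
    for vector in tfidf(docs):
        print(vector)
```

Terms that occur in few documents, such as "teacher" in the usage example, receive a larger inverse document frequency than terms that occur in most documents.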

b. N-Gram 

An n-gram is a sequence of n tokens or words, used in numerous text mining and natural
language processing tasks. The proposed technique constructs n-grams from the text of each post to extract
keyword features. N-gram features are extracted after the data has been preprocessed.

When n=2 is used, a sequence of two-word pairs is generated for each document. This step improves the
classifier's accuracy because of the information obtained from sequences of word pairs.
N-gram based extraction has been shown to perform reliably in extracting
features from text for several reasons:

- The most common roots in text data are automatically captured.
- The n-gram representation does not require a specific vocabulary.
- It has a high tolerance for distortion and misspellings.
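
The n-gram extraction described above can be illustrated with a short Python sketch. The function names are hypothetical, and the choice to join the words of an n-gram with a space in `ngram_features` is an assumption made for readability.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token sequences of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(tokens, max_n=3):
    """Collect unigrams up to max_n-grams as space-joined feature strings."""
    features = []
    for n in range(1, max_n + 1):
        features.extend(" ".join(gram) for gram in ngrams(tokens, n))
    return features


if __name__ == "__main__":
    tokens = ["online", "lecture", "was", "clear"]
    print(ngrams(tokens, 2))
    print(ngram_features(tokens, max_n=2))
```

For the four tokens in the usage example, n=2 yields three bigrams; a document shorter than n yields none.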

Feature selection can be considered a search problem, with each state in the search space defining a
subset of the available features. If a data set comprises three features A1, A2, and A3, and the
presence of a feature is coded with 1 and its absence with 0, then there are a total of 2^3 = 8 reduced-feature
subsets, coded as {0, 0, 0}, {1, 0, 0}, {0, 1, 0}, {0, 0, 1}, {1, 1, 0}, {1, 0, 1}, {0, 1, 1}, and {1, 1,
1}. When the search space is small, the problem of feature selection is relatively simple because we can
investigate all subsets in any order and the search will be completed quickly. The search space, on the other
hand, is frequently quite large: in typical data-mining applications with N dimensions, the number of subsets
is 2^N. To do machine learning, it is necessary to extract specific clues from the text. These clues may lead to
correct classification. They are collected in a feature vector (f1, f2, ..., fn), in which each coordinate
represents one clue, also known as a feature, fi, from the original text.
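
To make the size of the search space concrete, the following Python sketch enumerates every presence/absence mask for the three-feature example. The function name is illustrative, and the enumeration order is simply that of `itertools.product`.

```python
from itertools import product


def feature_subsets(features):
    """Yield (mask, subset) for every 0/1 presence mask over the features."""
    for mask in product((0, 1), repeat=len(features)):
        yield mask, [f for f, bit in zip(features, mask) if bit]


subsets = list(feature_subsets(["A1", "A2", "A3"]))

if __name__ == "__main__":
    for mask, subset in subsets:
        print(mask, subset)
    print("Number of subsets:", len(subsets))
```

Exhaustive enumeration of this kind is only practical for a handful of features, which is why score-based selection is used for text vocabularies.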

This section discusses two methods for selecting features. Both methods in this study apply a threshold on a
term-goodness criterion to achieve a targeted degree of term removal from the full vocabulary of the corpus.
The two criteria are information gain (IG) and CHI statistics. Each of these methods generates a score for each
feature. To pick the features for each technique, we use a heuristic strategy, defined as follows. The average
score is computed over all features.

Each feature's score is then compared to this average; if the feature's score is
higher than the average, the feature is selected. By employing these feature selection procedures,
the goal is to improve classification accuracy and gain insight into the data.
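
The above-average heuristic can be sketched in a few lines of Python. The scores in the usage example are invented purely for illustration, and the function name is hypothetical.

```python
def select_above_average(scores):
    """Keep the features whose score is strictly above the mean score.

    scores: {feature: score} as produced by IG or CHI.
    Returns the selected feature names in alphabetical order.
    """
    if not scores:
        return []
    mean_score = sum(scores.values()) / len(scores)
    return sorted(term for term, score in scores.items() if score > mean_score)


if __name__ == "__main__":
    # Invented scores, for illustration only.
    example = {"good": 0.9, "bad": 0.8, "the": 0.1, "course": 0.2}
    print(select_above_average(example))
```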

c. Information gain (IG) 

Information gain is also known as a term-goodness criterion in data mining [24]. It estimates the number of
bits of information obtained for class prediction by knowing the presence or absence of a specific word in a
text.

$$IG(t) = -\sum_{i=1}^{m} p(c_i)\log p(c_i) + p(t)\sum_{i=1}^{m} p(c_i \mid t)\log p(c_i \mid t) + p(\bar{t})\sum_{i=1}^{m} p(c_i \mid \bar{t})\log p(c_i \mid \bar{t}) \qquad (2)$$

where $p(c_i)$ denotes the probability of occurrence of class $c_i$, $m$ is the number of classes, $p(t)$ denotes
the probability that a word $t$ appears, and $p(\bar{t})$ denotes the probability that the word $t$ does not occur.
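
As an illustrative sketch, information gain can be estimated from document counts as shown below. The probabilities are taken as relative frequencies over the documents, and the logarithm is base 2 so that the result is in bits; both choices, as well as the function name, are assumptions for this example.

```python
import math


def information_gain(docs, labels, term):
    """Estimate IG(t) from a labeled collection.

    docs: list of token collections (lists or sets).
    labels: class label of each document, in the same order.
    """
    labels = list(labels)
    n_docs = len(docs)
    classes = sorted(set(labels))
    with_term = [lab for doc, lab in zip(docs, labels) if term in doc]
    without_term = [lab for doc, lab in zip(docs, labels) if term not in doc]

    def sum_p_log_p(subset):
        """Sum of p * log2(p) over the classes; zero terms are skipped."""
        if not subset:
            return 0.0
        total = 0.0
        for cls in classes:
            p = subset.count(cls) / len(subset)
            if p > 0:
                total += p * math.log2(p)
        return total

    p_term = len(with_term) / n_docs
    return (-sum_p_log_p(labels)
            + p_term * sum_p_log_p(with_term)
            + (1 - p_term) * sum_p_log_p(without_term))


if __name__ == "__main__":
    docs = [{"good", "course"}, {"good", "course"}, {"bad", "course"}, {"bad", "course"}]
    labels = ["positive", "positive", "negative", "negative"]
    print(information_gain(docs, labels, "good"))
    print(information_gain(docs, labels, "course"))
```

In the usage example, "good" separates the two classes perfectly, whereas "course" appears in every document and therefore carries no information about the class.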
d. CHI statistics

The CHI statistic can be used to evaluate a term's association with a category [16]. It is defined as:

$$CHI(t, c_i) = \frac{N \times (AD - BE)^2}{(A+E)\times(B+D)\times(A+B)\times(E+D)} \qquad (3)$$

$$CHI_{max}(t) = \max_i \, CHI(t, c_i) \qquad (4)$$

where A represents the number of times t and $c_i$ occur together, B represents the number of times t
occurs without $c_i$, E represents the number of times $c_i$ occurs without t, D represents the number of times
neither $c_i$ nor t occurs, and N is the total number of documents.
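
Equations (3) and (4) can be sketched in Python as follows, assuming that A, B, E, and D are document counts, so that N = A + B + E + D. Returning 0 when the denominator vanishes is a convention chosen for this example, and the function names are illustrative.

```python
def chi_square(a, b, e, d):
    """CHI(t, c) from the four contingency counts.

    a: documents containing t that belong to c
    b: documents containing t that do not belong to c
    e: documents in c that do not contain t
    d: documents with neither t nor c
    """
    n = a + b + e + d
    denominator = (a + e) * (b + d) * (a + b) * (e + d)
    if denominator == 0:
        return 0.0  # convention for degenerate tables (an assumption)
    return n * (a * d - b * e) ** 2 / denominator


def chi_max(counts_per_class):
    """Maximum CHI score of a term over all classes.

    counts_per_class: {class_name: (a, b, e, d)}
    """
    return max(chi_square(*counts) for counts in counts_per_class.values())


if __name__ == "__main__":
    print(chi_square(2, 0, 0, 2))
    print(chi_max({"positive": (2, 0, 0, 2), "negative": (0, 2, 2, 0)}))
```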


3.4. The proposed classifier 

One point is worth noting: probabilistic automata and the hidden Markov model (HMM)
are closely related. A probabilistic automaton is a structure made up of states, transitions, and a set of
transition probability distributions. The HMM was employed for sentiment
text classification in this research. We intend to build an HMM from a collection of text reviews of a
particular class Cj. We compare our integrated supervised data mining technique, based on HMM and SVM,
against the previously hand-classified training data to see whether the reviews are more likely
to have a "positive" or "negative" orientation [25].

The purpose is to determine all of the Markov model parameters that govern this
domain. A hidden Markov model is defined in terms of its states q and the symbols v in the vector of visible
symbols. We have used the HMM statement model of [25], which accepts all compound terms (bigrams and/or
trigrams) as well as symbols representing all relevant terms (unigrams) for a given class. The best approach to
adjusting the parameters of an HMM classifier is to build Markov models. An SVM is used for sentiment
classification to find a viable approach for data categorization. Texts are divided into two categories: positive
and negative. According to the structural risk minimization principle of computational learning theory, SVM
seeks a decision surface that separates the training data points into two classes (positive and negative).
Furthermore, decisions are based on the selected support vectors, which are the only effective elements in the
training set. The SVM classifier's results are obtained using ten-fold cross-validation. Training reviews
account for 80% of the reviews in each fold, with testing reviews accounting for 20%. For this purpose, we used
the SVMlight implementation of the support vector machine [20].

Our combination classifier is described below. As previously stated, several
different combining rules can be used to link various classifiers. In a nutshell, the framework for combining
classifiers can be described as follows. Assume that the classifier-selection stage selects N individual
classifiers Ck (k = 1, ..., N). Each classifier assigns a label Lk, taken from the set {f1, ..., fm}, to an input
sample (represented as Xk). Assume that the classifier Ck produces a measurement in the form of a posterior
probability vector for each output.


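$$P_k = [p(f_1 \mid X_k), \ldots, p(f_m \mid X_k)]$$

where $p(f_i \mid X_k)$ indicates the probability that the classifier $C_k$ would consider X to be labeled with $f_i$.

Majority voting rule:

$$\text{assign } z \rightarrow f_j, \quad j = \arg\max_i \sum_{k=1}^{N} \Delta_{ki} \qquad (5)$$

where $\Delta_{ki} = 1$ if $L_k = f_i$ and $\Delta_{ki} = 0$ otherwise.

Sum rule:

$$\text{assign } z \rightarrow f_j, \quad j = \arg\max_i \sum_{k=1}^{N} p(f_i \mid X_k) \qquad (6)$$

Max rule:

$$\text{assign } z \rightarrow f_j, \quad j = \arg\max_i \left[\max_{k=1}^{N} p(f_i \mid X_k)\right] \qquad (7)$$

Mean rule:

$$\text{assign } z \rightarrow f_j, \quad j = \arg\max_i \left[\operatorname{mean}_{k=1}^{N} p(f_i \mid X_k)\right] \qquad (8)$$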
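
The four combining rules can be sketched in Python as follows. This is an illustration rather than the study's implementation: each classifier's label for majority voting is assumed to be the index of its largest posterior, ties are resolved in favor of the lowest label index, and the three posterior vectors in the usage example are hypothetical.

```python
from statistics import mean


def combine(posteriors, rule="sum"):
    """Combine classifier outputs and return the index j of the chosen label.

    posteriors: one probability vector per classifier, all over the
    same ordered set of labels f_1..f_m.
    rule: "majority", "sum", "max", or "mean".
    """
    labels = range(len(posteriors[0]))
    if rule == "majority":
        # Each classifier votes for its most probable label (assumption).
        votes = [max(labels, key=lambda i: p[i]) for p in posteriors]
        scores = [votes.count(i) for i in labels]
    elif rule == "sum":
        scores = [sum(p[i] for p in posteriors) for i in labels]
    elif rule == "max":
        scores = [max(p[i] for p in posteriors) for i in labels]
    elif rule == "mean":
        scores = [mean(p[i] for p in posteriors) for i in labels]
    else:
        raise ValueError(f"unknown rule: {rule}")
    # Ties go to the lowest label index.
    return max(labels, key=lambda i: scores[i])


if __name__ == "__main__":
    # Hypothetical posteriors over (positive, negative) from three classifiers.
    outputs = [[0.9, 0.1], [0.4, 0.6], [0.45, 0.55]]
    for rule in ("majority", "sum", "max", "mean"):
        print(rule, combine(outputs, rule))
```

In the usage example, majority voting selects the second label because two of the three classifiers prefer it, while the sum, max, and mean rules select the first label because one classifier is highly confident. The sum and mean rules always agree, since the mean is the sum divided by N.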


Following Joachims [25], we can show that under various combining conditions, the combined HMM+SVM
approach outperforms the individual HMM and SVM classifiers. The combination classifier
using the sum rule appears to outperform the others (majority, max, and mean). Furthermore, for our combined
classifier, the configuration using unigrams, bigrams, and trigrams obtains the best overall performance. This
illustrates that a higher n-gram order can help polarity categorization, which explains its extensive use in
our research.


3.5. Experimental results 


3.5.1. Dataset 
We created a data set for the current study by extracting e-learning reviews from a range of 
e-learning blogs, such as http://elearningtyro.wordpress.com and http://elearningtech.blogspot.com/. Text 
reviews were also acquired from Moodle forums (http://docs.moodle.org/en/Forums) and other platforms such as 
Google Classroom, as well as from the survey questionnaire shared with students and instructors via Google 
Forms. In the forum module, students (learners) and teachers work together, and tutors can provide feedback 
in the form of comments. Our data set contains 1000 positive and 1000 negative texts, the latter being those 
in which people express a pessimistic attitude. In addition, 80% of the texts are used for training, with the 
remaining 20% used for testing. 
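
As a rough illustration of the 80/20 division, the sketch below splits a balanced corpus per class. The function name `stratified_split`, the placeholder review strings, and the choice to split each class separately are assumptions; the study does not state how the split was drawn.

```python
import random


def stratified_split(texts_by_label, train_fraction=0.8, seed=0):
    """Split each class separately so both sets keep the class balance."""
    rng = random.Random(seed)
    train, test = [], []
    for label, texts in texts_by_label.items():
        shuffled = texts[:]          # copy so the input is left untouched
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        train += [(text, label) for text in shuffled[:cut]]
        test += [(text, label) for text in shuffled[cut:]]
    return train, test


# Placeholder corpus with the class sizes reported above.
corpus = {
    "positive": [f"positive review {i}" for i in range(1000)],
    "negative": [f"negative review {i}" for i in range(1000)],
}
train, test = stratified_split(corpus)
print(len(train), len(test))
```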


3.5.2. Evaluation 

The confusion matrix, which sets the class assigned by the classifier (column) against the true class 
of the samples (row), is the most widely used instrument for evaluating classification performance. A 
confusion matrix was used for the validation step, as shown in Table 4. 


Table 4. A confusion matrix 
Class              Predicted positive     Predicted negative 
Actual positive    True positive (TP)     False negative (FN) 
Actual negative    False positive (FP)    True negative (TN) 
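
The four cells of the confusion matrix can be counted directly from paired labels. The following is a minimal sketch; the function name `confusion_counts`, the label strings, and the toy data are hypothetical.

```python
def confusion_counts(actual, predicted, positive="pos"):
    """Count (TP, FN, FP, TN) with rows as actual and columns as predicted."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            tp += 1      # actual positive, predicted positive
        elif a == positive:
            fn += 1      # actual positive, predicted negative
        elif p == positive:
            fp += 1      # actual negative, predicted positive
        else:
            tn += 1      # actual negative, predicted negative
    return tp, fn, fp, tn


# Toy labels for six reviews.
actual = ["pos", "pos", "pos", "neg", "neg", "neg"]
predicted = ["pos", "pos", "neg", "pos", "neg", "neg"]
print(confusion_counts(actual, predicted))
```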


The efficacy of sentiment categorization is measured in terms of standard precision (P), recall (R), 
and F-measure (F), which are defined as: 


$$\mathrm{Precision(positive)} = \frac{TP}{TP + FP} \tag{9}$$

$$\mathrm{Precision(negative)} = \frac{TN}{TN + FN} \tag{10}$$

Precision is the ratio of correctly classified cases to all cases that the system outputs for a given class. 


$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{11}$$


Recall (also called sensitivity) is the ratio of correctly predicted positive observations to all 
observations in the actual class. To combine these two metrics into a single value, the F-measure is usually 
used. In general, the F-measure weights the importance of recall relative to precision. The F1-score is 
obtained by giving precision the same weight as recall: 


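$$F_1\text{-score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{12}$$

$$F_1\text{-score} = \frac{TP}{TP + \frac{1}{2}(FP + FN)} \tag{13}$$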


TP=number of true positives 
FP=number of false positives 
FN=number of false negatives 
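
The measures in (9) to (13) can be computed from the confusion-matrix counts as in the sketch below. The function names and the example counts are hypothetical and are not results from this study; the sketch also does not guard against empty denominators.

```python
def precision_positive(tp, fp):
    return tp / (tp + fp)                  # equation (9)


def precision_negative(tn, fn):
    return tn / (tn + fn)                  # equation (10)


def recall(tp, fn):
    return tp / (tp + fn)                  # equation (11)


def f1_score(precision, recall_value):
    # Equation (12): harmonic mean of precision and recall.
    return 2 * precision * recall_value / (precision + recall_value)


def f1_from_counts(tp, fp, fn):
    # Equation (13): the same quantity written in terms of counts.
    return tp / (tp + 0.5 * (fp + fn))


# Hypothetical counts for illustration only.
tp, fp, fn, tn = 80, 20, 10, 90
p = precision_positive(tp, fp)
r = recall(tp, fn)
print(p, r, f1_score(p, r), f1_from_counts(tp, fp, fn))
```

Substituting (9) and (11) into (12) gives $2TP / (2TP + FP + FN)$, which is (13), so the two F1 functions agree up to floating-point rounding.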

In terms of precision, recall, and F-measure, IG is the best at identifying sentiment-bearing phrases 
and conducting sentiment classification in most situations, as evidenced by the experimental results in 
Table 5. The accuracy on our blog-collected e-learning corpus does not appear to be particularly good. This 
underlines how challenging blog text is to process, mostly because of the noisy content on these blogs, which 
hampers learning. The fact that the bloggers are not professional writers adds to the problem. They come from 
a variety of age groups, cultures, locations, and faiths. As a result, when writing blogs or blog comments, 
people may not follow grammatical rules. Because they are accustomed to using Netspeak, even those who are 
fluent in English do not always write in a journalistic manner. Netspeak is a way of reducing typing time in 
which words are shortened and letters are replaced with other letters and/or symbols. It is often difficult 
to decipher the true meaning of such characters or symbols. On examining the e-learning blogs, we also found 
frequent use of emoticons. 


Table 5. Two feature selection methods’ performance on our proposed learning model 


        Positive                          Negative 
        Precision  Recall  F1-score       Precision  Recall  F1-score 
CHI     0.735      0.746   0.741          0.742      0.735   0.739 
IG      0.794      0.843   0.831          0.817      0.793   0.798 


4. CONCLUSION 

In terms of precision, recall, and F-measure, IG is the best at identifying sentiment-bearing phrases 
and conducting sentiment classification in most situations, as evidenced by the experimental results. The 
accuracy on our blog-collected e-learning corpus does not appear to be particularly good. Nevertheless, 
sentiment analysis proves useful for investigating the character and structure of web forums and e-learning 
blogs. This is a significant undertaking, yet the present accuracy shows promise for effective sentiment 
analysis of forum conversations. This type of analysis can aid in the development of a better product by 
giving developers a grasp of users' perspectives on e-learning systems. Our research has shown that 
integrating sentiment classification into e-learning is a viable option. We found that when the two feature 
selection methods (IG and CHI) were applied to the sentiment classification of e-learning blogs and forums, 
our hybrid HMM and SVM classifier with the Sum rule delivered better results with IG than with CHI. Another 
significant factor is the set of unique issues connected with mining e-learning reviews and analyzing 
e-learning blogs, which make the process more difficult and complex; these issues are at the root of our loss 
of accuracy. 